Predicting the Efficacy of Novel Synthetic Compounds in the Treatment of Osteosarcoma via Anti-Receptor Activator of Nuclear Factor-κB Ligand (RANKL)/Receptor Activator of Nuclear Factor-κB (RANK) Targets

Background: Osteosarcoma (OS) currently demonstrates a rising incidence, ranking as the predominant primary malignant tumor in the adolescent demographic. Notwithstanding this trend, the pharmaceutical landscape lacks therapeutic agents that deliver satisfactory efficacy against OS. Objective: This study aimed to authenticate the outcomes of prior research employing the HM and GEP algorithms, endeavoring to expedite the formulation of efficacious therapeutics for osteosarcoma. Methods: A robust quantitative constitutive relationship model was engineered to prognosticate the IC50 values of innovative synthetic compounds, harnessing the power of gene expression programming. A total of 39 natural products underwent optimization via heuristic methodologies within the CODESSA software, resulting in the establishment of a linear model. Subsequent to this phase, a mere quintet of descriptors was curated for the generation of non-linear models through gene expression programming. Results: The squared correlation coefficients and s2 values derived from the heuristics stood at 0.5516 and 0.0195, respectively. Gene expression programming yielded squared correlation coefficients and mean square errors for the training set at 0.78 and 0.0085, respectively. For the test set, these values were determined to be 0.71 and 0.0121, respectively. The s2 of the heuristics for the training set was discerned to be 0.0085. Conclusion: The analytic scrutiny of both algorithms underscores their commendable reliability in forecasting the efficacy of nascent compounds. A juxtaposition based on correlation coefficients elucidates that the GEP algorithm exhibits superior predictive prowess relative to the HM algorithm for novel synthetic compounds.


INTRODUCTION
Osteosarcoma (OS) is fundamentally characterized as a malignancy of Mesenchymal Stem Cells (MSCs) that initiates from the formation of the bone matrix [1,2].It predominantly manifests in the long bones, with the humerus accounting for approximately 10%, the tibia around 24%, and the femur approximately 52%.Notably, the tumor site frequently presents with symptoms, such as pain, swelling, impaired joint mobility, and skeletal disturbances, with a notable predisposition towards lung metastasis [3].Alarmingly, only 60% to 70% of patients have been found to achieve a 5year survival rate, with a marked predominance in males.Metastatic or recurrent cases exhibit a sobering overall survival rate lingering at 25%.Presently, our understanding postulates that the pathogenesis of osteosarcoma is underpinned by aberrant regulation of mesenchymal cells, anomalies in tumor suppressor gene differentiation, oncogene activation, epigenetic perturbations, and cytokine synthesis.Such processes catalyze the proliferation and multifaceted differentiation of cells, orchestrated by intricate signaling pathways and a plethora of regulatory factors [4].Osteosarcoma's therapeutic landscape is expansive, encapsulating modalities ranging from surgical interventions and chemotherapy to innovative strategies, like molecularly targeted therapy, immunotherapy, gene therapy, embolization, radiofrequency ablation, and stem cell therapy.Nonetheless, it is disconcerting that, despite these advancements, the therapeutic efficacy has remained stagnant for the past three decades [5], accentuating the imperative for novel, potent therapeutic agents.
Grounded in the understanding that osteosarcoma's genesis is intricately linked to osteoblasts [6,7], it is worth noting that the nuclear factor κb ligand-receptor activator (RANKL), an osteoblast secretory product, has been classified within the Tumor Necrosis Factor (TNF) receptor ligand family [8][9][10][11].This molecule avidly binds to the nuclear factor κb receptor (RANK) present on osteoblasts, subsequently fostering their maturation, differentiation, and heightened functional activity.Concurrently, the secretion of Osteoprotegerin (OPG) by osteoblasts plays a pivotal role in osteoclast dynamics, as OPG competitively binds to RANKL, effectively curtailing osteoclast activity.Presently, the RANKL/RANK/OPG axis is lauded as a cardinal mechanism governing bone metabolism [12].Delving into the RANKL/RANK signaling cascade, it emerges that osteosarcoma cells amplify RANKL expression via the secretion of parathyroid hormone-related protein (PTHrP).This upregulation bolsters osteoclast activity, and the subsequent liberation of growth factors during cellular lysis amplifies tumor cell proliferation and PTHrP synthesis, culminating in a detrimental feedback loop instigating lytic bone lesions.Evidently, this signaling cascade offers a promising therapeutic target for osteosarcoma.Notably, a slew of investigations have honed in on this potential, elucidating agents such as RANKL inhibitors (which thwart osteoclastogenesis by suppressing RANKL activity) and non-peptide organic moieties (which deter osteoclastogenesis by competitively binding to RANKL) [13].Yet, these agents are marred by pronounced side effects, underscoring the pressing need for novel drug entities.
In a groundbreaking endeavor, Jiang and Peng [13] unveiled a suite of innovative RANKL/RANK inhibitors.Their seminal work spotlighted a compound (referred to as "1") showcasing marked affinity for RANKL and RANK proteins, as identified through virtual screening.Subsequent Surface Plasmon Resonance (SPR) assessments and Rankldriven osteoclastogenesis assays have attested to the compound's binding prowess and its robust osteoclastogenic inhibitory capacities.This revelation catalyzed the synthesis of 39 analogous compounds, among which compounds 2, 7, 29, and 34 emerged as formidable inhibitors of osteoclast differentiation, while demonstrating minimal cytotoxicity.Moreover, rigorous molecular assays utilizing RT-PCR and WB techniques substantiated that compound 34 notably suppressed the expressions of NFATc1 and c-FOS signaling pathways, which are downstream of the RANKL/RANK nexus, thereby inhibiting the excretion of bone-resorbing enzymes.
To augment our insights into the drug's therapeutic potential, we have leveraged computational paradigms for molecular simulation and drug design.This modern approach eclipses traditional drug discovery modalities in terms of cost-effectiveness and efficiency.Historically, Quantitative Structure-Activity Relationships (QSAR) have been instrumental in dissecting databases, deciphering the quantitative interplay between a molecule's structural intricacies and its experimental attributes.Through adept QSAR modeling, we aim to develop novel, high-efficacy drugs for expeditious experimental deployment [14].

METHODS
Within the confines of this study, a compilation of 39 avant-garde RANKL inhibitor architectures alongside their respective IC 50 values was examined [13] (Table 1).A subset comprising thirty molecular structures was judiciously segregated to create the model, while the remaining nine were earmarked for the creation of the validation or test set.

Calculation of Descriptors
At the forefront of constructing a QSAR modellies the quintessential step of utilizing descriptors to epitomize compounds.Our initial maneuver involved channeling the 39 structures into the ChemDraw platform.A cohort of thirty molecules was chosen at random for the training ensemble, with the subsequent nine designated as the testing cluster.These delineated structures were then integrated into the HyperChem software suite [15,16], whereupon the MM+ molecular mechanics force field, coupled with the semiempirical AM1 or PM3 methodologies [17], was harnessed for optimal structural refinement.Culminating this phase, intricate computations were orchestrated via the MOPAC 6.0 software, yielding three distinct files for every individual structure.

Linear Modeling with Heuristics
For transitioning the MNO files into the CODESSA computational environment [18], the discerning selection of parametric variables emerged as pivotal.This rigorous exercise bore 551 descriptors in total.Employing Heuristic Methodologies (HM) proved invaluable for descriptor sieving [15,19].Certain benchmarks were imperative [20]: (a) extirpation of null or "0" variables; (b) culling of overlycorrelated variables precipitating suboptimal outcomes; and (c) the eradication of variables beset with lackluster correlations amid attributes and said variables.via the HM paradigm, a select quintet of descriptors was crystallized for the synthesis of the superlative multiple linear regression model.These descriptors can be taxonomically classified into five dominant categories: constitutive, topological, geometric, electrostatic, and quantum chemical descriptors.The entire operational methodology is predicated on the sequential augmentation of descriptors, with precision anchored by paramount regression coefficients, namely the R^2 value, crossvalidated R^2 (R^2cv), and the F-statistic (F) value.While R^2cv assays reliability, the triumvirate of R^2, F, and s^2 vouches for model authenticity.The heuristic-centric optimization algorithm is lauded for its lucidity, rapidity, and determinate computational duration.Notably, the intrinsic efficiency of HM, particularly its temporal economy, underscores its preference over alternative methodologies [18].

Gene Expression Programming (GEP) emerges as an evolutionary algorithm, intrinsically inspired by the intricate
nuances of biological gene architecture and functionality [21].The foundational tenets of the GEP algorithm encompass the stochastic generation of a specified quantum of chromosomal entities (dubbed the initial populace).Subsequently, these chromosomes undergo expression, post which the fitness of every singular entity is assayed against a predetermined fitness sample suite (the problem's input criterion).Culminating this, predicated on the ascertained fitness metrics, entities are earmarked for genetic modification, spawning progeny endowed with novel traits.This cyclical process perpetuates across myriad generations, culminating in the identification of an optimal solution.The GEP algorithmic framework is demarcated into quintessential stages: delineation of the coding schema, the fitness function, and the operational function set (+, -, *, /); calibration of control parameters; meticulous crafting of the network operator; and the rigorous establishment of terminal benchmarks [22].

HM Calculations
Through the employment of the CODESSA software suite, 39 molecular compounds were meticulously computed, resulting in the generation of a total of 551 descriptors.An intriguing trend was observed, wherein, with an ascending quantity of descriptors, there was a concurrent increment in the values of R^2 and R^2cv, while s^2 exhibited a persistent decrement.The fluctuations in these values became negligible when the number of parameters approached aptly termed the "breaking point" [23,24].Concurrently, this juncture also satisfied the conditions set by the number of parameters (k) and sample size (n): n ≥ 3 (k+1).

Results of GEP Calculations
In a bid to juxtapose the predictive acumen of HM against GEP, the twin descriptors delineated by the HM protocol were transposed into the APS software infrastructure [20], culminating in the derivation of a nonlinear model.This task was bifurcated into distinct phases: initially, the dataset was stochastically partitioned into a testing ensemble (comprising 30 compounds) and a training cadre (consisting of 9 compounds).Subsequently, the model was engendered by processing the training data, with its reliability vetted against the test set.A tabulation of the pertinent parameters is presented in Table 1.For both the training and testing sets, the R^2 values were discerned as 0.78 and 0.71 respectively, with corresponding mean squared errors being 0.0085 and 0.0121.The equations for the nonlinear model, as manifested within the expression tree, followed suit.

Comparison of GEP and HM Algorithms
As evident in Fig. (2), the correlation coefficients deduced via the GEP algorithm consistently eclipsed those of the HM algorithm within the training subset.Furthermore, the error margins associated with GEP were discernibly more diminutive in comparison to the HM algorithm.This observation held consistent validity not just within the training subset, but also mirrored it in the testing set, wherein the GEP algorithm's error magnitude substantially outperformed that of the HM methodology (Table 1 and Fig. 2).This empirical juxtaposition underscores the formidable modeling prowess of GEP alongside its commendable generalization capabilities, reinforcing its potential as a trustworthy instrument for the prediction of IC 50 values pertinent to pharmaceutical compounds.

DISCUSSION
For a nuanced understanding of the impact of each descriptor parameter on the IC 50 value, a thorough examination of each descriptor is imperative.This meticulous analysis can help facilitate the extraction of critical physicochemical information influencing the IC 50 value, thereby providing invaluable insights for the development of more potent compounds.Within the framework of the HM model, the molecular geometry is intrinsically determined by the number of backbone atoms.In the context of this study, carbon atoms constitute the primary backbone affecting the semi-inhibition rate.Importantly, the architectures crafted by various ar- rangements of carbon atoms wield a discernable influence on the IC 50 value.To illustrate, the molecular structure of methyl imidazole, enriched by an additional methyl group as compared to pyridine, fortifies the lipophilic attributes of the compound.Ionic liquids exhibiting elevated degrees of lipophilicity are consequently more readily adsorbed and aggregated within the cellular membranes of biological entities.Moreover, a direct positive correlation exists between the descriptor KSI and the IC 50 value, signifying an amplification in the toxicity of ionic liquids concomitant with an increase in the descriptor value.RNCS serves as an illustrative descriptor demarcating the spatial distribution of negative charges within ionic liquids, and its significance is inextricably linked to hydrogen bonding capabilities.Given that bacterial membranes inherently possess a negative charge, an augmentation in the negatively charged surface area of ionic liquids induces electrostatic repulsion, thereby inhibiting effective membrane binding.Such alterations in membrane conformation are instrumental in conferring toxicity, as substantiated by extant literature [16].Consequently, in the quest for engineering environmentally benign ionic liquids, novel structures exhibiting reduced toxicity could be realized through judicious modulation of the KSI and RNCS values.

Fig. (2). Comparison of linear and nonlinear models. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
The descriptor RNDB quantifies the relative presence of double bonds in ionic liquids and is categorized as a compositional descriptor.The incorporation of double bonds within either the cationic (e.g., c=c) or anionic (e.g., -s=o) backbone considerably influences the binding affinity of the ionic liquid to the target.Additionally, oxidative cleavage or reduction of these double bonds has the potential to modulate the compound's toxicity profile [17][18].Therefore, RNDB's role as a pivotal model parameter is accentuated by its pronounced impact on the pEC 50 value, revealing a negative correlation with the same.
The inclusion of the descriptor RNDB in the model parameterization is of paramount importance, particularly owing to its pronounced influence on the pEC 50 value associated with Vibrio cyanobacteria Q67.Notably, RNDB manifests a negative correlation with the pEC 50 value, underscoring its salient role in the toxicity modulation of ionic liquids.
The descriptor NFA delineates the number of fluorine atoms incorporated into the ionic liquid and has a significant bearing on the toxicity attributes of the compound.Ionic liquids typically employ carbon atoms as their skeletal backbone and synergize with hydrogen to form hydrocarbon chains.The introduction of fluorine atoms, either partially or completely supplanting hydrogen atoms, culminates in the formation of fluorocarbon surfactants, characterized by fluorine-dominated chains [15].Given that fluorine is the most electronegative element, and the fluorine-carbon bond boasts the highest bond energy among covalent bonds, fluorocarbon structures exhibit elevated stability compared to their hydrocarbon counterparts [19].In the milieu of biological metabolic oxidative decomposition, fluorine-substituted ionic liquids demonstrate superior stability.Additionally, fluorocarbon surfactants possess exceptional compatibility properties, facilitating the diffusion and absorption of ionic liquids into biological entities, thereby enhancing toxicity.These observations collectively elucidate the positive correlation between NFA and pEC 50 values in the HM model; specifical-ly, an increase in fluorine atom count amplifies the inhibitory efficacy of ionic liquids against bioluminescent bacteria.
The descriptor ABOCA is an acronym representing the Average Bonding modality of Carbon Atoms within the ionic liquid.The bonding modes, whether single, double, or triple, between carbon atoms, can vary substantially among different ionic liquids, significantly impacting the ABOCA value.The susceptibility of double and triple bonds to oxidation contributes to toxicological effects, including perturbations in cellular integrity and potential genetic material modifications [20].Literature corroborates the notion that the alkyl chain length in ionic liquids exerts a significant influence on their toxicity.An increase in alkyl chain length augments the lipophilicity of the ionic liquid, thereby enhancing its adsorptive and aggregative propensity on biological cellular membranes.Such a phenomenon facilitates the disintegration of the cellular membrane architecture, culminating in the eradication of the biological entity and a consequent reduction in luminescence intensity.This mechanistic understanding substantiates the role of ABOCA in increasing the pEC 50 value within the HM model.In summary, modulating the lipophilicity of an ionic liquid could serve as a viable strategy for attenuating its inherent toxicity.This investigation presents a robust Comparative Molecular Similarity Indices Analysis (COMSIA) model, distinguished by noteworthy statistical parameters, including a high cross-validated correlation coefficient (q^2) of 0.529, a non-cross-validated correlation coefficient (r^2) of 0.993, and a minimal Standard Estimated Error (SEE) of 0.033.Additionally, the COMSIA contour maps elucidate pivotal structural features within steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor fields, thereby imparting invaluable predictive insights pertinent to the understanding of structure-activity relationships in pharmacologically active compounds.
The superior predictive power of this model paves the way for rational drug design endeavors.Leveraging this model, 200 novel 1,8-naphthalimide derivatives have been conceptualized, predicated on the structural foundation of compound 7a.Their IC 50 values have been prognosticated utilizing the aforementioned COMSIA model.Of these synthetically envisioned entities, the derivative 7a.10 stands out, exhibiting both the most favorable IC 50 predictive value and optimal docking affinity with the target DNA molecule.Consequently, the findings of this research offer pragmatic directives for the advancement of novel DNA-targeting chemotherapeutic agents specifically engineered for osteosarcoma treatment.
The outcomes of the present study substantiate that both the Harmony Search (HM) and Gene Expression Programming (GEP) algorithms exhibit formidable predictive capabilities.The HM algorithm offers a distinct advantage of computational alacrity, unencumbered by constraints on dataset dimensions or structural format.It is proficient in swiftly generating optimal regression equations, thereby furnishing an array of models from which researchers may judiciously select.Additionally, its automated parameter selection obviates the potential omission of critical descriptors for ionic liquids.Importantly, it delineates not merely the category of descriptor most germane to the activity values, but also ascertains the statistical significance of said descriptor.Conversely, the Gene Expression Programming Algorithm Package (APS) affords a user-friendly modus operandi, enabling iterative model validation via test sets, which inherently renders the optimization trajectory more intuitive and facile.This algorithm amalgamates the merits of both genetic algorithms and genetic programming, representing individual entities through fixed-length linear codes, thereby streamlining genetic manipulations and enhancing its prowess in resolving complex problems.Comparative metrics reveal that the R^2 and S^2 values associated with the training sets of the GEP model are demonstrably superior to those of its HM counterpart.
Empirical evidence garnered from the test set substantiates the stability inherent in the Gene Expression Programming (GEP) model, revealing it to marginally outperform the Harmony Search (HM) algorithmic method.Intriguingly, the Quantitative Structure-Activity Relationship (QSAR) model conceived via GEP is predicated on a nonlinear function, juxtaposed with the comparatively rudimentary linear function proffered by the HM algorithm [21].While the incorporation of genetic operations, such as mutation, recombination, and inversion alongside functional manipulations elevates the complexity and operational intricacy of the GEP framework, rigorous optimization techniques yield results of notable satisfaction.
In the ambit of this scholarly investigation, both the HM and GEP algorithms have been judiciously employed to probe the architectural nuances of ionic liquids and their concomitant QSAR with respect to the Q67 toxicity exhibited by Vibrio cyanobacteria.The results yielded two distinct predictive models, of which the toxicity prediction schema engendered by the GEP algorithm demonstrated superior efficacy relative to its HM counterpart.
Successful application of the GEP algorithm to the prediction of ionic liquid activity yielded a QSAR model of enhanced precision.This fortifies the assertion that nonlinear models can be efficaciously deployed in the realm of biological QSAR research, producing results that meet rigorous academic standards.Through meticulous investigation of the involved descriptors, a wealth of information has been gleaned concerning the physicochemical structural elements that govern ionic liquid toxicity.This repository of knowledge may serve to illuminate pathways for the future conceptualization and engineering of ionic liquids with attenuated toxicity profiles.

CONCLUSION
In the present study, an integrative approach leveraging both Harmony Search (HM) and Gene Expression Programming (GEP) methodologies has been devised to forecast the ligand-binding affinity of pharmaceutical agents with osteosarcoma-associated molecular targets.Employing a rigorous computational framework to ascertain molecular structural descriptors, linear and nonlinear Quantitative Structure-Activity Relationship (QSAR) models have been constructed via HM and GEP algorithms.The ensuing predictive outcomes have been adjudged to be commensurate with empirical expectations, thereby affirming the methodological efficacy of the study.

HUMAN AND ANIMAL RIGHTS
Not applicable.

CONSENT FOR PUBLICATION
Not applicable.
Fig. (1).The effects of different numbers of descriptors on R 2 , R 2 cv, and S 2 .(A higher resolution / colour version of this figure is available in the electronic copy of the article).
testament to its stability.The intricate specifics of these parameters are illuminated in Fig.(1).